Abstract - Performance evaluation of simultaneous
tracking and identification (STID) systems consists of
measures of performance, measures of effectiveness, and
measures of force effectiveness. To investigate the
capability of STID, we extend track purity to a current
assignment ratio. Track purity determines the correctly
associated measurements for a given scan whereas the
Current Assignment Ratio (CAR) determines the correct
measurements for a given track. Using the CAR aids
sensor  management  algorithms  in  determining  which 
targets  have  robust  features  for  target  identification  in 
clutter,  such  that  the  target  can  be  tracked.  The  CAR 
enables  effectiveness  evaluation  of  STID  systems  for 
mission  success.  In  the  paper,  we  review  the  fusion 
performance  evaluation  literature,  outline  STID  metrics, 
and  demonstrate  the  use  of  the  CAR  in  a  scenario. 
1 Introduction

Seminal  texts  in  data  fusion  [1],  target  tracking  [2,  3],  and 
information  fusion  [4,  5]  have  addressed  performance 
evaluation (PE); however, the community as a whole has yet to
adopt standards by which systems are uniformly
evaluated. For example, the PE of multitarget
simultaneous tracking and identification (STID) methods
poses challenges for situational awareness [6]. STID is a
subset  of  information  fusion  (IF),  which  includes  filtering, 
estimation,  and  prediction  of  data  as  well  as  a  derivative 
evaluation  over  the  context  of  the  operating  conditions  of 
the  sensors,  targets,  and  environments. 

Initial  work  on  tracking  PE  was  focused  on  optimal 
methods  [7]  which  do  not  necessarily  hold  in  a  dynamic 
environment.  Key  developments  and  methods  have  been 
postulated  and  evaluated  by  X.  Li  and  Y.  Bar-Shalom  [8], 
K.  C.  Chang  [9],  and  C-Y.  Chong  and  S.  Mori  [10].  Each 
of  these  approaches  offers  insight  into  the  problem  by 
clarifying  useful  measures  of  performance  (MOPs)  over 
tracking  methods.  We  revisit  the  work  of  the  above 
authors  by  looking  at  track  purity  [11,  12]  and  extending 
the  method  for  the  novel  STID  analysis  by  using  the 
current  assignment  ratio  (CAR)  [13].  Even  in  the  last 
year,  prominent  researchers  are  looking  at  track  purity  as  a 
metric  of  interest  [14],  as  opposed  to  track  lifetime  [15]  to 
understand  the  capability  of  forming  tracks  from 
measurements in clutter. The contributions of this paper
are: a literature research summary in tracking evaluation,
the  application  of  purity-based  methods  to  STID 
scenarios,  and  a  general  extension  of  how  purity-based 
methods  support  measures  of  effectiveness  (MOEs). 

Use  of  features,  attributes,  and  categorical 
representations  of  targets  has  become  more  prominent  as 
users (or analysts) desire to know not only where the
target is, but who it is and, even more practically, what the
target’s behavioral intention is. Initial use of track and ID
methods sought to recognize landmarks for navigation
[16, 17] and distinguish targets from clutter [18, 19].
Simultaneously processing target identification (who) and
target tracking (where) can have mutual benefits for both
reasoning systems [20, 21]. To show how ID
information may help in data association, Figure 1
illustrates how a target ID can refine the
positional measurement to select the validated
measurement from the cluttered measurements. Numerous
approaches in joint tracking and recognition (category),
classification (type), and identification (fingerprint) have
been applied using emerging techniques such as the
Joint-Belief Probabilistic Data Association (JBPDA) algorithm
[21, 22, 23], pose-aiding radar [24], DSmT [25], and
particle filters [26]. Features (or set-based feature
combinations) from radar [27, 28, 29], infrared [30], and
hyperspectral [31] data have been assessed. Recently,
Dezert combined the proportional conflict redistribution
(PCR) method with an interacting multiple-model (IMM)
estimator [32]. An STID algorithm can improve track quality,
mitigate clutter confusion, and enhance target ID. Inherent
in the research is the capability to discern target type and
location from different vantage points as fused from
distributed platforms using sensor management [33].


Figure 1. ID / Position Measurement Data Association.
Sensor management includes user placement and control
of sensors, automated processing, and visualization of
performance [34]. Information fusion systems must
undergo rigorous tests before being operationally ready.
Thus, there are routinely efforts to describe the objective
and threshold metrics to guide testing of real-world
systems [35, 36, 37]. To aid operational testing, numerous
papers have tried to categorize the metrics of interest for
not only the information fusion processing, but the system
as a whole [38, 39, 40, 41, 42]. Various texts have
provided MOEs in addition to the MOPs [1, 4].

The  tracking  community  has  supplied  numerous  papers 
and  methods  in  tracking  metrics  and  performance 
evaluation  (PE).  Initial  work  includes  performance 
analysis  of  trackers  in  clutter  [43],  dense  targets  [44],  and 
the algorithms themselves [45, 46]. As the need for track
evaluation increased, papers appeared that summarized
metrics [47] and offered new instantiations of some of the
metrics [48]. K. C. Chang, S. Mori, and C-Y. Chong [49, 50, 51]
continued to mature the techniques, as did X. R. Li
and Z.-L. Zhao [52, 53, 54]. Numerous other examples
exist of reporting results of research [55], tracking
toolboxes [56], and most recently issues of computational
costs and scalability [57]. As an example, W. D. Blair lists
MOPs of accuracy, completeness, ambiguity, continuity,
timeliness, and commonality [58], which are similar to
those proposed by Blasch of accuracy, timeliness,
confidence, throughput, and cost [6].

Performance  evaluation  of  classification  methods  is 
quite  mature  in  the  literature  due  to  the  elements  of  pattern 
recognition,  image  processing,  and  the  security 
surveillance industry. PE of classifiers typically includes
receiver operating characteristic (ROC) curves [59] that plot
probability of detection versus probability of false alarm. Advances
from  information  fusion  include:  confusion  matrix 
analysis  [60,  61],  applications  of  Bayesian  [62]  and 
Dempster-Shafer  methods  [63],  and  on-line  tools  [64]. 

Similar  to  results  from  tracking  and  identification,  there 
is  a  need  for  MOPs  for  situational  awareness  (SA).  SA 
includes  situational  assessment  and  threat  analysis  with 
cues and inputs from users. SA metrics and evaluation are not
yet a well-established area of research, but can build from
the  developments  of  the  tracking  and  classification 
communities.  Examples  include  user  involvement  [65], 
situational  awareness  tools  [66],  and  high-level  MOEs 
[67].  Next  we  describe  the  measures  of  merit. 


1.1 Measures of Merit

For complex Command and Control (C2) systems, the
merit of the system can be established at various levels of
observation. The Military Operations Research Society
(MORS) developed a hierarchy of measures of merit (MOMs) [35]
for Command, Control and Information Systems (C2IS) that
can be summarized as follows:

• Measures of Force Effectiveness (MOFEs): focus on how a
force performs its mission or the degree to which it meets
its objectives.

• MOEs: focus on the impact of C2 systems within the
operational context.

• MOPs: focus on the internal system structure,
characteristics and behaviour. MOPs of a system may be
reduced to measures based on time, accuracy, capacity or
a combination that may be interdependent.

• Dimensional Parameters: are the properties or
characteristics inherent in the physical C2 systems.

Since the boundaries between the different levels can be
quite fuzzy, this hierarchy provides rough divisions of a
continuum of scales of observations, and serves as a
guideline for the evaluation process. Some authors [37]
also suggest additional levels:

• A measure of military utility, which tries to remove some of
the scenario dependency of the measures.

• A measure of policy effectiveness, which measures the worth of
operations. Sometimes a successful mission is not a
guarantee of overall success, i.e., winning a battle does not
necessarily mean that the war will be won.

• A measure of Command and Control effectiveness, which
measures the decision support capabilities.

Figure 2(a) shows the encircling relationship of the
metrics and Figure 2(b) shows a balanced approach for
metric analysis important to the user.

Figure 2. MOPs and MOEs relations.

Figure 3 summarizes the NATO Code of
Best Practices for C2 Assessment [68], which highlights
the importance of the metrics as well as the tradeoffs.

Figure 3. MOM tradeoffs (from C. Wallshein [68]).

Figure  4  develops  the  MOMs  in  relation  to  the  C3I  system 
as  a  whole.  Effectiveness  is  based  on  function,  structure, 
and  capability.  The  NATO  Code  of  Best  Practices  for  C2 
Assessment  also  describes  methods  of  evaluation  through 
tests  and  scenarios  of  interest.  The  MOEs  afford  speed  of 
data  analysis,  efficiency  in  communication,  and  risk 
reduction  (or  safety)  from  threats. 

Figure 4. MOEs for C3I testing (from S. H. Starr [68]).


We  have  summarized  key  developments  in  fusion 
performance  evaluation  over  tracking,  classification,  and 
system level analysis. A survey of the entire spectrum
would exceed the length of a paper, so we highlight one aspect
within the scope of the larger efforts. Here, we focus on
extending  the  track  purity  MOP  to  afford  MOE  analysis 
for  track  and  ID  systems.  Section  2  is  a  summary  of 
performance  evaluation  with  developments  of  track  purity 
and  the  Current  Assignment  Ratio  (CAR).  Section  3 
briefly  describes  the  belief  track  and  ID  method  in  a 
scenario  to  demonstrate  the  use  of  the  CAR.  Section  4 
provides  a  conclusion  and  discussion. 

2  Performance  Evaluation 

The  recommended  Measures  of  Performance  (MOPs) 
quantify  the  following: 

• Information accuracy: evaluates the quality of the
positional tracking of the ground truth platforms in
terms of the positional accuracy, the track purity, and
the correct assignment ratio.

• Information consistency: looks at the coherence in the
information between a sensor’s database and the task
coordinator database and the coherence between the
organic and the non-organic system data. When a
bad ID is associated to a track, the inconsistency in
the information can manifest itself as a track
switch. Track switch inconsistency can be
identified with track purity, correct assignment ratio,
and track continuity.

• Picture clarity: addresses the expected enhancement in
terms of object identification as well as the system
robustness to problematic sensor information
generating false, redundant, or spurious tracks.

• Picture completeness: evaluates how much of the real
world the system knows. For the purpose of this
analysis the real world is reduced to a specified
region of space (the volume of interest, VOI) during
a given time interval (the time interval of interest).

• Track management statistics: permit an evaluation of
how well the system behaves in real time. The load
on the computer is measured in terms of the number
of tracks and objects it has to process and the time it
takes to execute the different operations. Another
question addressed by track management statistics is
how well handover of a track from one sensor to
another is performed in both systems. This is done by
comparing the time and modality (manual,
automatic) of track deletion with track continuity.

Some  of  these  measures  are  performed  on  a  single 
track  (or  ground  truth  platform)  for  an  analysis  of  the 
information  stability.  Others  are  statistical  measures  based 
on  many  tracks/ground  truth  platforms  and  are  used  to 
establish  an  average  system  behaviour.  A  Measure  of 
Force  Effectiveness  (MOFE),  the  model-based  measure,  is 
also  proposed  as  an  overall  estimator  of  the  system  value. 

A  better  understanding  of  the  system  performance 
will  be  gained  if  an  effort  is  made  to  partition  the  different 
measures  according  to  the  type  of  tracks  and  region  of 
interest  (air,  surface,  underwater). 

2.1  Track  Purity  and  CAR 

Track purity (TP), a concept coined by Mori et al. [11],
assesses  the  percent  of  correctly  associated  measurements 
in  a  given  track,  and  so  evaluates  the  association/tracking 
performance.  The  track  purity  MOP  is  not  explicitly 
dependent  on  detection  performance,  but  it  is  dependent 
on  the  setting  of  association  gates  (which  depends  on  the 
probability  of  detection  Pd)  and  the  ground  truth  platform 
density.  Track  purity  measures  the  consistency  with 
which  a  track  is  updated  with  measurements  from  a  single 
ground  truth  platform  or  a  set  of  ground  truth  platforms. 

Correctional local MOPs, such as track purity, measure
how well the tracks in an IF system are being associated
with measurements of ground truth platforms. The track
purity MOP is based on the calculation of a confusion
matrix C for which the elements C_{ji} are constructed by
counting reports. Given the tracks t_1, ..., t_b and a set of
ground truth platforms g_1, ..., g_a, C is:

$$C = \begin{bmatrix} C_{11} & \cdots & C_{1a} \\ \vdots & \ddots & \vdots \\ C_{b1} & \cdots & C_{ba} \end{bmatrix}$$

Here, C_{ji} is the number of reports originating from
ground truth platform g_i which were assigned to track t_j
(i = 1, ..., a; j = 1, ..., b) by the IF algorithm. Also, C_{0i} (the
“ambiguity vector”) consists of the number of reports that
could not be assigned to any ground truth platform (i = 1,
..., a). When C_{ji} is large, a strong association between t_j
and g_i is implied.

Track purity is defined as the percentage of correctly
associated measurements contained in a given track. The
purity of the track t_j is defined as the normalised value of
the largest element in the row defined by t_j:

$$TP(t_j) = \frac{\max_{1 \le i \le a} C_{ji}}{\sum_{i=1}^{a} C_{ji}} \qquad (1)$$

The  TP  measure  can  be  estimated  for  each  single  track, 
but  is  more  meaningful  when  statistics  of  the  TP  quantity 
are  calculated.  A  recommended  statistic  is  the  Weighted 
Average  of  Track  Purity  (WATP)  taken  over  all  tracks  and 
ground  truth  platforms.  The  WATP  statistic  should  be 
calculated  separately  for  each  type  of  track  for  air,  surface 
and  underwater  platforms.  It  has  a  particularly  convenient 
form  if  the  weight  given  to  each  track  is  the  number  of 
measurements  for  that  track,  and  if  the  weight  given  to 
each  ground  truth  platform  is  the  number  of  measurements 
originating  from  that  ground  truth  platform.  The  resulting 
definition  of  the  WATP  is  as  follows: 

$$WATP = \frac{\sum_{j=1}^{b} \max_{1 \le i \le a} C_{ji}}{\sum_{j=1}^{b} \sum_{i=1}^{a} C_{ji}} \qquad (2)$$

The  following  elements  are  needed  to  compute  Track 
Purity  (TP)  or  WATP: 

a.  The  list  of  correct  (CO)  track  numbers  for  which  TP  will 
be  computed  (provided  by  the  operator), 

b.  For  each  CO  track  pertaining  to  the  selected  CO  track 
number,  one  needs  the  CO  track  number,  the  valid  time 
and  the  ground  truth  platform  number  to  which  the  CO 
track  is  attached,  and 

c.  For  each  ground  truth  element  corresponding  to  any 
ground  truth  platform  number  present  in  the  selected  CO 
tracks,  one  needs  the  time  stamp  and  the  ground  truth 
platform  number. 

The confusion matrix is the starting point of many MOPs
and its construction requires a lot of computation.
Basically, we have to associate each CO track report to a
target in the ground truth. The choice of association can
be made by an association function that we will name
Associate. This function takes as arguments a track T at
the time t, and the complete lists of tracks and ground
truth targets. Associate can be driven by positional
and/or ID data. The resulting confusion matrix depends on
the function Associate, and it can be useful to test the
related MOPs with some variations of Associate. Here is a
procedure to construct the confusion matrix that uses
Associate:

a. Collect data to have all CO track reports for each track and
each history point of all targets in the ground truth,

b. Initialize the confusion matrix by filling each entry with
zeros,

c. For each track, process all CO track reports by:

1) Using the association function Associate to find the
corresponding target in the ground truth, and

2) Adding 1 to the related entry of the confusion matrix.
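
The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report layout (dictionaries with `track`, `time`, and `pos` keys), the `truth_history` structure, and the nearest-position version of Associate are all hypothetical choices, not part of the original method description.

```python
import math

def associate(report, truth_at_time):
    """Hypothetical Associate: return the ground truth target nearest in position."""
    return min(truth_at_time, key=lambda g: math.dist(report["pos"], truth_at_time[g]))

def build_confusion_matrix(track_reports, truth_history, track_ids, target_ids):
    # Step b: initialize every entry with zero (rows: tracks, columns: targets).
    C = {t: {g: 0 for g in target_ids} for t in track_ids}
    # Step c: process all CO track reports.
    for report in track_reports:
        truth_at_time = truth_history[report["time"]]  # truth aligned in time with the report
        g = associate(report, truth_at_time)           # step c.1
        C[report["track"]][g] += 1                     # step c.2
    return C

# Illustrative data only: two targets, two tracks, two time steps.
truth_history = {
    0: {1: (0.0, 0.0), 2: (0.0, 5.0)},
    1: {1: (1.0, 0.0), 2: (1.0, 5.0)},
}
track_reports = [
    {"track": "A", "time": 0, "pos": (0.0, 0.1)},
    {"track": "A", "time": 1, "pos": (1.0, 0.2)},
    {"track": "B", "time": 0, "pos": (0.1, 5.0)},
    {"track": "B", "time": 1, "pos": (1.1, 0.4)},  # track B drifts onto target 1
]
C = build_confusion_matrix(track_reports, truth_history, ["A", "B"], [1, 2])
print(C)
```

Keeping Associate as a separate function makes it easy to swap in a different criterion and compare the resulting MOPs.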

So  the  given  algorithm  can  be  automated  if  the 
association  function  is  feasible.  First,  the  function 
Associate  needs  all  the  data  aligned  in  time  with  the  given 
CO  track  report.  This  can  take  a  lot  of  computing  time 
since  it  is  proportional  to  the  number  of  tracks  and  targets. 
The  first  step  can  be  computed  automatically  without 
difficulty.  The  second  step  is  to  determine  and  use  an 
association  criterion  that  will  select  a  target  from  the  list  of 
all  targets  in  the  ground  truth.  The  second  step  can  also  be 
automated  since  we  can  always  select  a  target  and,  by 
hypothesis,  the  target  list  is  not  empty.  The  criterion  can 
be  based  on  position,  ID,  or  both.  Investigation  has  to  be 
made  to  find  the  best  criterion.  Here,  we  will  give  some 
examples  of  criteria. 

a. A rapid and easily implementable criterion is to choose the
target that is the closest to the CO track. It is fast since it is
only proportional to the number of targets. However, it can
lead to erroneous results like associating all tracks with the
same ground truth.

b. A better criterion is to use a Nearest Neighbour or a JVC
(Jonker-Volgenant-Castanon) association algorithm, which
requires more computation
since it is proportional to the number of tracks and the
number of targets. Based on position and/or ID, an
intermediate association matrix has to be created to find
the right association. Since each track belongs to a target
in the ground truth, there is no problem when the number
of tracks is lower than or equal to the number of targets. The
problem occurs when the number of tracks is greater than
the number of targets (this may happen when a lot of
spurious tracks are present). It would cause the algorithm
to be unable to associate the CO track with a target.
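
To make the difference between the two criteria concrete, the following sketch compares an independent closest-target choice with a one-to-one assignment. The one-to-one search here is a brute-force enumeration used as a stand-in for a JVC-style solver, not an implementation of JVC; the function names and coordinates are assumptions chosen for illustration.

```python
import math
from itertools import permutations

def nearest_target(track_pos, targets):
    """Criterion (a): each track independently picks its closest target."""
    return {t: min(targets, key=lambda g: math.dist(p, targets[g]))
            for t, p in track_pos.items()}

def one_to_one(track_pos, targets):
    """Exhaustive one-to-one assignment minimizing total distance.

    Only practical for a handful of tracks. Returns None when there are
    more tracks than targets, mirroring the failure case in the text.
    """
    tracks = list(track_pos)
    best, best_cost = None, math.inf
    for perm in permutations(targets, len(tracks)):
        cost = sum(math.dist(track_pos[t], targets[g]) for t, g in zip(tracks, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(tracks, perm)), cost
    return best

targets = {1: (0.0, 0.0), 2: (4.0, 0.0)}
track_pos = {"A": (1.0, 0.0), "B": (1.9, 0.0)}

independent = nearest_target(track_pos, targets)
joint = one_to_one(track_pos, targets)
print(independent)  # both tracks are associated with target 1
print(joint)        # each track gets its own target
```

In this toy geometry the closest-target rule assigns both tracks to the same ground truth, which is the erroneous behaviour noted for criterion (a), while the joint assignment separates them.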

The Current Assignment Ratio (CAR) [13] measures the
performance for a ground truth platform instead of
measuring the performance for a track. The CAR MOP
for ground truth platform g_i is defined as the normalized
value of the largest element in the column defined by g_i
(i.e., by an analogous equation to TP, but maximising and
summing over columns rather than rows). It assesses the
percentage of contacts from a ground truth platform
associated with the correct track:

$$CAR(g_i) = \frac{\max_{1 \le j \le b} C_{ji}}{\sum_{j=1}^{b} C_{ji}} \qquad (3)$$

The higher the values of TP and CAR, the better the
association performance is. Both measures are important
and both should be measured, since the matrix C has no
special symmetry. Since the requirements to obtain this
measure are the same as for TP, the difficulty and
relevance to the present study are the same.
The CAR could be measured for a single ground truth
platform, but a statistical quantity for many ground truth
platforms is more significant. A useful statistic is the
weighted average of CAR taken over all ground truth
platforms (WACAR). It has a particularly convenient form
if the weight given to each track is the number of
measurements for that track, and if the weight given to
each ground truth platform is the number of measurements
originating from that ground truth platform. The resulting
definition of the WACAR is as follows:

$$WACAR = \frac{\sum_{i=1}^{a} \max_{1 \le j \le b} C_{ji}}{\sum_{i=1}^{a} \sum_{j=1}^{b} C_{ji}} \qquad (4)$$


An example is presented in Figure 5:

X: contact from Platform 1
O: contact from Platform 2

Figure 5. Example of object-to-track association.

In this case the confusion matrix is:

           Platform 1   Platform 2
Track A        7            1
Track B        2            6

From this matrix the track purity is calculated as TP(A) =
7/8 = 87.5% and TP(B) = 6/8 = 75%, and WATP =
(7+6)/(7+1+2+6) = 13/16 = 81.3%. Similarly, the correct
assignment ratio is CAR(1) = 7/9 = 77.8% and CAR(2) =
6/7 = 85.7%, and WACAR = (7+6)/(7+2+1+6) = 13/16 = 81.3%.
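
The arithmetic above is easy to reproduce programmatically. The sketch below assumes the confusion matrix is stored as a list of rows (one row per track, one column per ground truth platform); the function names are illustrative. It is applied to the two-track example and to the 3 x 3 matrix reported later for the track-only scenario run.

```python
def track_purity(C):
    """TP per track: largest entry in each row divided by the row sum."""
    return [max(row) / sum(row) for row in C]

def car(C):
    """CAR per ground truth platform: largest entry in each column divided by the column sum."""
    return [max(col) / sum(col) for col in zip(*C)]

def watp(C):
    """Weighted average of track purity over all tracks."""
    return sum(max(row) for row in C) / sum(sum(row) for row in C)

def wacar(C):
    """Weighted average of CAR over all ground truth platforms."""
    return sum(max(col) for col in zip(*C)) / sum(sum(row) for row in C)

# Two-track example: rows are Track A and Track B, columns are Platforms 1 and 2.
C_example = [[7, 1],
             [2, 6]]
tp_example = track_purity(C_example)    # 7/8 and 6/8
car_example = car(C_example)            # 7/9 and 6/7
print(tp_example, car_example, watp(C_example), wacar(C_example))

# Matrix reported for the track-only scenario run (rows: Tracks 1 to 3).
C_run = [[197, 3, 0],
         [5, 193, 2],
         [5, 21, 174]]
tp_run = [round(v, 3) for v in track_purity(C_run)]
car_run = [round(v, 3) for v in car(C_run)]
print(tp_run, car_run, watp(C_run))
```

Because TP normalizes over rows and CAR over columns, the two sets of values differ whenever the matrix is not symmetric, while WATP and WACAR coincide in both examples because the largest entry of every row is also the largest entry of its column.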

2.2  Summary  of  Measures 

The  problem  of  track  level  and  ID-level  fusion  has 
characteristic  tradeoffs  about  which  the  sensor 
management  system  must  arbitrate  over  a  given  scenario. 
Here  we  list  some  of  the  salient  metrics. 

Measures of Performance

1. Information Accuracy
   Positional Accuracy
   Track Purity
   Correct Assignment Ratio
   Accuracy of the filter covariance

2. Information Consistency
   Proportion of target groups recognized
   Proportion of recognized SA ROI in organic SA

3. Information Currency
   Time in VOI Prior to Detection
   Time from Detection to Confirmation
   Data throughput

4. Situational Clarity
   Time of Positive Classification / Positive Identification
   Probability of detection and False Alarm Rate (FAR)
   Target Confidence
   Accuracy of Bayesian Percent Attribute Miss (BPAM)
   Spurious Track Mean Ratio
   Target Track Exchange Rate

5. Situational Completeness
   Completeness History
   Value - area coverage

6. Track Management Statistics
   Time of track initiation / deletion
   Track Continuity / Track Lifetime
   Track swaps, broken tracks
   Real-Time system Parameters

Measures of Effectiveness

1. Scenario Measures
   Timeliness of information
   Survivability as a function of detected targets

2. Threat evaluation
   Percentage of Targets Correctly Assessed
   Target Nomination Rate
   Degree of Exactness in the Threat List Ranking
   Information Gain
   Protection (Protection = 1 - Threat Level)

3. Decision Support
   Usability
   Surveillance picture
   Safety over all target position/IDs/intents (Safety = 1 - risk)

Measures of Force Effectiveness

1. Resource Management
   Response Time over network communications
   Time between target confirmation and weapon release
   Cost of Battle over resources and people

2. Command-Level Support
   Mission Analysis
   Interoperability to send and receive contextual data

(see also the summary from Llinas [4])

3  Track  and  ID  Scenario 

3.1  Problem  Formulation 

Consider  an  environment  in  which  a  tracker  is  monitoring 
multiple  moving  targets  with  stationary  clutter.  By 
assumption,  the  tracking  sensor  is  able  to  detect  target 
signatures. Assume that the 2-D region is composed of T
targets with f features. Dynamic target measurements z are
taken at time steps k, which include target kinematic and
identification features z(k) = [x_t(k), f_1, ..., f_n]. A final
decision from the STID algorithm is rendered as to which
[x, y] measurement is associated with the target-type.

The  multisensor-multitarget  tracking  and 
identification  problem  is  to  determine  which  measured 
kinematic  features  should  be  associated  with  which  ID 
features  in  order  to  optimize  the  probability  that  targets  are 
tracked  and  identified  correctly  after  z  measurements.  The 
multilevel feature fusion problem is formulated and solved
by using concepts developed for the belief filter [21]. In
the belief filter, the "association rule" uses the
measurement with the highest target probability in a
joint-belief probabilistic data association filter (JBPDAF). The ID
information is a result of the fused classification results,
where the aggregated classification is over 20-degree
window pose measurements for the various targets.
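
As a rough illustration of an association rule that picks the measurement with the highest target probability, the sketch below scores each candidate measurement by the product of a kinematic likelihood and an ID belief. The product rule, the isotropic Gaussian likelihood, the function names, the target-type labels, and all numbers are assumptions made for illustration; this is not the JBPDAF of [21].

```python
import math

def kinematic_likelihood(z_pos, predicted_pos, sigma):
    """Isotropic 2-D Gaussian likelihood of a position measurement (assumed model)."""
    d2 = (z_pos[0] - predicted_pos[0]) ** 2 + (z_pos[1] - predicted_pos[1]) ** 2
    return math.exp(-0.5 * d2 / sigma ** 2) / (2 * math.pi * sigma ** 2)

def select_measurement(measurements, predicted_pos, target_type, sigma=1.0):
    """Return the index of the highest-scoring measurement and all scores."""
    scores = [
        kinematic_likelihood(m["pos"], predicted_pos, sigma)
        * m["id_belief"].get(target_type, 0.0)
        for m in measurements
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Hypothetical gate contents: index 0 is closer in position but has a weak ID match.
measurements = [
    {"pos": (10.2, 5.1), "id_belief": {"tank": 0.2, "truck": 0.8}},
    {"pos": (10.6, 5.4), "id_belief": {"tank": 0.9, "truck": 0.1}},
]
best, scores = select_measurement(measurements, (10.0, 5.0), "tank")
print(best, scores)
```

With position alone the first measurement would be chosen; including the ID belief shifts the selection to the second, which is the kind of refinement the ID information is meant to provide in clutter.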


3.2  Scenario  and  Track  and  ID  Results 


As shown by the true trajectories in Figure 6,
the targets 1) start with position and velocity, 2) pass by
each other at a close distance, and 3) finish with a
specified direction. Noise was added to the true
target position, and clutter comprised 5 spurious
measurements around a target. While we could use
ID-derived results from our previous work in radar [21],
EO/IR [30], or HSI [31], we chose the results reported in
[30] for the derivation of the true ID and clutter as a
function of pose. Likewise, the JBPDAF [21, 30] determines
the belief in the target type amongst clutter.


Figure 6. Track Scenario with Clutter Measurements.


Figure 7 shows the track-only effectiveness when
targets do not have ID information. The separation allows
for the determination of a validation gate size that
associates the correct measurements to tracks. However,
when targets are close, the tracker has track switches as
measurements from one track are assigned to a different
track. The confusion matrix, TP, and CAR for the entire run are:

$$C = \begin{bmatrix} 197 & 3 & 0 \\ 5 & 193 & 2 \\ 5 & 21 & 174 \end{bmatrix}$$

               Track 1   Track 2   Track 3
Track Purity    0.985     0.965     0.870
CAR             0.952     0.889     0.989

Note that WATP = 0.940. From these results we see
that Track 2 has a high purity but lower CAR. Track 3
exhibits the opposite, with a low purity and high
CAR. Track 1 moves in a linear fashion, while Tracks 2 and
3 are maneuvering. Track 2 has a short overlap with Track
1 in the beginning, while Track 3 has a large overlap
period during which there is confusion with Track 2. The
key here is that TP shows the incorrect assignment of
measurements (track confusion) while CAR demonstrates
how the target can be confused with the other targets
(ID confusion). Since an MOE includes situational
awareness, both track and ID information are required.

Figure 7. Tracking without ID information.

Figure 8 shows that TP, CAR, WATP, and WACAR are
improved with an STID system.

Figure 8. Tracking with ID results.

               Track 1   Track 2   Track 3
Track Purity    0.975     0.995     0.975
CAR             0.980     0.996     1.000

3.3 Measures of Performance

In  an  effort  to  do  an  analysis  for  MOEs,  we  plot  a  spider 
chart  of  the  other  metrics  (normalized  to  1  over  the  results 
for  each  metric),  as  shown  in  Figure  9.  Space  limits  a 
detailed  analysis;  however  (1)  accuracy  is  high  because 
the  tracks  are  known,  (2)  confidence  is  from  the  ID 
information,  (3)  timeliness  is  from  the  measurement 
reporting,  (4)  throughput  is  from  the  usefulness  of  the  data 
(good  measurements),  and  (5)  value  is  related  to  the 
opportunity  cost.  Given  the  area  of  coverage,  the  value  of 
the  situational  analysis  requires  analyzing  the  entire  space. 
Here  it  is  lower  due  to  the  scenario  in  which  the  analysis 
of  the  closely  spaced  targets  requires  a  focused  attention 
of  the  sensors  while  giving  up  entire  area  coverage. 

Figure 9. Measures of Performance.
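
One plausible reading of "normalized to 1 over the results for each metric" is that each metric is divided by its largest value across the compared runs before plotting. The sketch below shows that normalization under this assumption; the metric names, run labels, and values are placeholders, not results from the scenario.

```python
def normalize_per_metric(results):
    """Scale each metric so its largest value across the compared runs equals 1."""
    normalized = {}
    for metric, values in results.items():
        peak = max(values.values())
        normalized[metric] = {
            run: (v / peak if peak else 0.0) for run, v in values.items()
        }
    return normalized

# Placeholder values in arbitrary units, for illustration only.
results = {
    "accuracy":   {"track_only": 0.94, "stid": 0.98},
    "timeliness": {"track_only": 2.0, "stid": 4.0},
}
norm = normalize_per_metric(results)
print(norm)
```

Normalizing each axis separately keeps metrics with different units on a common 0 to 1 scale, at the cost of hiding their absolute magnitudes.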

MOEs are also important as related to the preference of
the user to have a surveillance picture of the region of
coverage, as shown in Figure 10 (see footnote 1).

Figure 10. Measures of Effectiveness.


Timeliness is similar to the MOP, yet this is for the MOE
information. Survivability is related to the detection of the
targets, and safety (Safety = 1 - Risk) is related to
knowing all the target types and identities. Likewise,
protection (Protection = 1 - Threat) is related to the area of
coverage. Interoperability is related to the communication
and delivery of the correct target/track information. Note
that with a limited simulation, the normalized metrics
highlight any discrepancies. For instance, over 200
position measurements there are numerous instances for
timeliness and survivability updates. However, safety,
protection, and interoperability are simulated as related to
the track (versus each measurement). We will explore
these metrics in future papers, but they are presented here for
discussion.

4  Discussion  &  Conclusions 

Performance  evaluation  of  classification  results  and  target 
tracking  techniques  has  posed  difficulty  in 
standardization.  We  presented  numerous  efforts  to 
develop  metrics  and  tools  for  evaluation.  To  further 
extend knowledge in the area, we looked at the TP metric
and determined that it could be used for higher-level
fusion analysis as supporting MOEs. We examined the
STID scenario using the belief filter to highlight the
usefulness of a Current Assignment Ratio in addition to a
Track Purity metric.

1 Note that these are notional MOEs; however, since many MOEs are left
undefined, there is no standard for consistency for commercial or public
development to ensure the safety of targeted information.

The  research  aim  is  for  system-level  information  fusion 
evaluation  over  the  sensor,  target,  and  environmental 
operating  conditions  and  variations.  We  initiated  a 
discussion  on  the  presentation  of  all  MOPs  and  MOEs  in  a 
spider  plot  for  a  user  to  grasp  the  complete  performance  of 
the  STID  system.  Future  work  will  explore  the  sensitivity 
of  the  results,  the  presentation  of  the  MOE  metrics,  and 
use  of  operational  data  to  validate  the  approach.  For 
example,  we  seek  to  explore  sensor  management,  image 
fusion,  and  terrain  updates  [69]  as  impacting  the  MOEs. 